Flexible access of data stored in a database

ABSTRACT

A method and system for processing a request to retrieve data from a database is disclosed. The method of the present invention comprises receiving the request which comprises a function that processes at least one data item specified in the request, retrieving the at least one data item from a database, and invoking the function. When the function is invoked, at least one parameter comprising information related to the at least one data item is passed to the function. Thus, a function can be written independently of the data which they are called to process.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit under 35 USC §119 of UnitedKingdom Application No. 0209957.0, filed on May 1, 2002.

FIELD OF THE INVENTION

[0002] The present invention relates to improved flexibility ofretrieval of data stored in a relational database.

BACKGROUND OF THE INVENTION

[0003] Relational databases are widely used in industry to store, interalia, commercial data, such as client details and product details. ARelational Database Management System (RDBMS) is software which enablesa user to define, construct and manipulate a database using a high levellanguage. Examples of RDBMS's are DB2® of IBM®, Informix® DynamicServer™, Microsoft® SQL Server and Oracle®.

[0004] Structured Query Language (SQL) is a standard high levellanguage, specified by the American National Standards Institute (ANSI),and provides for data definition and manipulation. It is provided bymost RDBMS providers, although each provider may provide extensions tothe basic standard language. Probably the most widely used SQL statementis the “Select” statement which is used to retrieve data from adatabase. This is probably best illustrated by a simple example.Consider a table named “EmployeeTable”, defined in a database, whichcontains 3 rows of 3 columns of data items, each row containing dataitems containing the name, number and salary of each employee. Anexample of the contents of such a database are: Name (varchar(20))Number (char(7)) Salary (integer) Alexandra 0012345 35,000 Sean 002234630,000 Rebecca 0032347 30,000

[0005] Note that each column has a defined data type: “Name”, defined asVarchar(20), is a variable length null terminated character string of upto 20 characters; “Number”, defined as char(7), is a fixed lengthcharacter string of 7 bytes; and “Salary”, defined as integer, is anumber.

[0006] A “Select” statement can now be used to retrieve data items fromthis table. For example, the simplest form of the “Select” statement is:

[0007] Select * From EmployeeTable;

[0008] This is used to obtain a list of all data items in the table. Theexact format of the output may vary according to which RDBMS is used butmay, for example, be: Name Number Salary Sean 0022346 30,000 Alexandra0012345 35,000 Rebecca 0032347 30,000

[0009] While this style of output is not necessarily produced by anyRDBMS it will be used for the remainder of this specification by way ofexample only.

[0010] An example of a more selective “Select” statement is:

[0011] Select Name, Salary From Employee Table Where Salary>32000;

[0012] This would return the name and salary of all employees with asalary of more than 32,000. Note, however, that this statement requiressome knowledge of the “EmployeeTable” as the names of the columns “Name”and “Salary” are specified. In this example the output would be: NameSalary Alexandra 35,000

[0013] Now, suppose that the salary output is required in US Dollarsalthough the “Salary” column in “EmployeeTable” is entered in sterling.One way of achieving this is for the user to define, to the RDBMS, afunction for converting sterling to US dollars which can be used as partof a select statement. Such a function is known as a User DefinedFunction (UDF) in DB2. This function could, for example, be called“Sterling_to_USDollars”, take as input an integer (the data type of“Salary”), and output the resulting calculation as a string containing“$” followed by the converted salary. It would then be possible tospecify a “Select” statement using this function:

[0014] Select Name, Sterling_to_USDollars(Salary) From EmployeeTable

[0015] Where Salary>32000;

[0016] This statement, assuming an exchange rate of 1.4, would thenproduce output: Name Salary Alexandra $49,000

[0017] Further, suppose the company uses the last four characters of the“Number” column to contain the telephone extension of each employee, andrequires an output to contain the telephone extension instead of theemployee number. This can also be done using a function defined to theRDBMS which could, for example, be called “Number_to_Phone”, take asinput a 7 character field, and output the last 4 characters of the“Number” field preceded by an “x”. This now enables a select statementof:

[0018] Select Name, Number_to_Phone (Number),Sterling_to_USDollars(Salary)

[0019] From EmployeeTable;

[0020] This would produce the output: Name Number Salary Sean x2346$42,000 Alexandra x2345 $49,000 Rebecca x2347 $42,000

[0021] Note that it is possible to combine the functionality of“Sterling_to_USDollars” and “Number_to_Phone” into a single functiontaking two input parameters (“Salary” and “Number”).

[0022] It can be noted from these examples that a “Select” statementwhich is more selective than “Select *”, requires some knowledge of“EmployeeTable”. For example the user must know the column names(“Name”, “Number”, “Salary”) and that data type of “Salary” (for “WhereSalary>32000”). Further UDF's require some knowledge of the data typetaken as input, for example “Sterling to_USDollars” must be written toaccept the “Salary” input as an integer.

[0023] In fact any method of producing output which is not a full andunmodified listing of the database as produced from “Select *” requiressome knowledge of the database. Whilst such knowledge can be removedfrom the select statement by writing a UDF function that takes an entirerow as input, the UDF must be written with a parameter list that definesthe correct column data types in the correct order. This is clearlyrestrictive on the user, making it impossible to write a UDF that is nothard coded to accept a specific data type or specific data types in aspecific order.

[0024] Therefore, there is a need for a more flexible method and systemfor retrieving data using a UDF. The present invention provides such amethod and system.

SUMMARY OF THE INVENTION

[0025] The present invention provides method and system for processing arequest to retrieve data from a database. The method of the presentinvention comprises receiving the request which comprises a functionthat processes at least one data item specified in the request,retrieving the at least one data item from a database, and invoking thefunction. When the function is invoked, at least one parametercomprising information related to the at least one data item is passedto the function.

[0026] According to the method and system of the present invention, adatabase engine can enable a user defined function to be used whenretrieving data items from a database, whereby the user defined functiondoes not require specific knowledge of the type(s) of the data items onwhich it is to operate. Thus, a function can be written independently ofthe data which it is called to process.

[0027] According to a first aspect, the at least one data item has anassociated value and type and the at least one parameter comprises thevalue and an indication of the type of the at least one retrieved dataitem. Preferably the information passed to the function, in the at leastone parameter, also comprises the names of each of any data items it ispassed. The name of a data item will normally be the name associatedwith a column in a table from which the data item was retrieved. As aresult, if the function needs to be aware of the name associated withthe data items on which it operates, it can obtain this information fromthe information it is passed.

[0028] A further advantage can be achieved by enabling one or morevariables to be passed to the function. Preferably a name and value forone or more variables can be included in the request, for passing to thefunction. These variables are then passed, in the information passed tothe function, in the same way as a data item retrieved from the databaseand assuming a default type. For example, assuming a default type ofnull terminated string, the specification of a variable in the format “$as Currency” would result in the information passed to the functionincluding an item with the name “Currency”, a value of “$”, and a typeof null terminated string. Optionally specification of a variable caninclude a type to be used instead of the default type.

[0029] Where a data retrieval request requests data items from a tablewhich comprises a plurality of rows, each row comprising a plurality ofdata items, an RDBMS could implement the present invention by passinginformation pertaining to all data items from the plurality of rows in asingle invocation of the function. However, this would only be usefulfor a simple retrieval request involving, for example, no otherfunctions or conditions. As a result it is preferable to call thefunction once for the data items contained in each row of data that isretrieved from the table, according to the retrieval request.

[0030] It can be useful for a function to enclose any output it producesas a result of processing all of data items specified for retrieval bythe request with, for example, a leading string and a trailing string.This is problematic if the function is called more than once as part ofprocessing the request and as a result it is preferable that thefunction is called in a way that enables it to do this. One method ofachieving this is to initially invoke the function before passing it anydata items, with an indication that it is the first invocation (and inthe next invocation it will be passed one or more data items) and tofinally invoke the function, after all data items have been passed, withan indication that this is the last call (and no more data items will bepassed). The initial and final calls do not pass any data items, but maypass variables. Alternatively, assuming that the function is invokedmore than once to process data items, the first and last callindications can be included in the first and last calls, respectively,to the function which pass data items. Further note that the requests toretrieve data can specify that the data items are to be passed to therow function in one or more groups based on one or more data items. Inthis case the row function can be called to indicate the first and lastcalls to the row function for each group of data items and the rowfunction may therefore be called more than once with last and first callindications in the processing of a single request.

[0031] Preferably the request can also include a value which indicatesthat all data items from each row of a table are to be retrieved, forexample “*” as in “Select *”. Preferably this can be done when therequest further includes one or more variables to be passed to thefunction. All data items from each row are then passed to the functionon each invocation in addition to any variables. This is advantageousbecause it enables a function to be written which can be used fordifferent tables and modify its behaviour based on the value of thevariable passed to it.

[0032] Preferably an RDBMS allows the function to be pre-registered sothat when it is specified in a retrieval request the RDBMS can easilycheck the request for valid use of the function. Alternatively, forexample, the RDBMS can locate the function based on a search path.

[0033] Preferably the output accepted by the RDBMS, from the functioncalled, is a character string. This may be, for example, a return valuefrom the function or an output parameter further included in theparameter list. This enables a row function to be used in the same wayin retrieval requests as, for example, prior art user defined functionsas supported by DB2.

BRIEF DESCRIPTION OF THE DRAWINGS

[0034] The invention will now be described, by way of example only, withreference to a preferred embodiment thereof, as illustrated in theaccompanying drawings, in which:

[0035]FIG. 1 is a block diagram of a data processing environment inwhich the preferred embodiment of the present invention can beadvantageously applied;

[0036]FIG. 2a is an example of the contents of a simple table defined ina database;

[0037]FIG. 2b is an example of the type of output that may be requiredfrom a user defined function;

[0038]FIG. 2c is an example of the type of output that may be requiredfrom a user defined function;

[0039]FIG. 3 is a flow diagram of receiving a request to register a rowfunction;

[0040]FIG. 4 is a flow diagram of processing a request to retrieve dataitems from a database; and

[0041]FIG. 5 is a flow diagram of calling a row function during theprocessing of a request to retrieve data items from a database.

DETAILED DESCRIPTION

[0042]FIG. 1 is a block diagram of a data processing environment inwhich the preferred embodiment of the present invention can beadvantageously applied. In FIG. 1, a client/server data processing host10 is connected to other client/server data processing hosts 12 and 13via a network 11, which could be, for example, the Internet. An RDBMS,such as DB2, may be installed on any such client/server and acceptrequests to retrieve data from any client/servers, including the one onwhich the RDBMS is installed. Client/server 10 has a processor 101 forexecuting programs that control the operation of the client/server 10, aRAM volatile memory element 102, a non-volatile memory 103, and anetwork connector 104 for use in interfacing with the network 111 forcommunication with the other client/servers 12 and 13.

[0043] The preferred embodiment will be described with reference to DB2and SQL although the present invention equally applies to other RDBMS'sincluding those that use an alternative to SQL. DB2 provides support forUser Defined Functions (UDFs) which are generally written by usersalthough they can also be provided by the RDBMS provider or even a 3rdparty. A UDF is used in performing operations within an SQL statementand is useful for such things as transforming data values, performingcalculations on one or more data values and extracting parts of a value.A UDF can be written in a high level language, for example C, C++ orJava, although C will be used in the preferred embodiment. Once written,a UDF is defined to DB2 and then, when used, is effectively run as partof the DB2 engine. This provides a performance advantage and enablessimplification of the processing of SQL requests by users andapplications. Whilst the preferred embodiment is discussed in terms of aUDF, other RDBMS's may use a different term to refer to an equivalentfeature.

[0044] The present invention is best illustrated by example and is bestfirst viewed from the perspective of the user. Consider the databasetable previously discussed and as illustrated in FIG. 2a. This figureshows the employee table which has been populated with three rows ofdata, each row being for a different employee, namely Alexandra, Seanand Rebecca. Each row of data comprises three columns which contain thename, employee number and salary for each employee. Note that theheadings (201) shown in FIG. 2a are not a row of data in the table butwill have been defined when the table was created and are included inthis figure for illustration purposes. These show that the “Name” columnis defined as a variable length character string of up to 20 bytes, the“Number” column is defined as a fixed length character string of 7 bytesand the “Salary” column is defined as an integer.

[0045] Assume, from this table, that the user requires to output an XMLstring for each employee, as illustrated in FIG. 2a. This can beachieved using a user defined function (UDF). Consider first a prior artUDF named “Employee_to_XML” and which is used in the following “Select”Statement:

[0046] SELECT Employee_to_XML(*) FROM EmployeeTable;

[0047] A UDF is called once for each row of data and so “Employeeto_XML” must accept, as input, data items of the required type for eachrow. For example, in C, the function prototype could be: void SQL_API_FNEmployee_to_XML ( char *name, /* input parameter value of name  */char *number, /* input parameter value of number */ int *salary, /*input parameter value of salary  */ char *out_XML_String /* output fieldcontaining XML string */ );

[0048] Based on this prototype and assuming that the function hasknowledge of the column names (i.e., “Name”, “Number” and “Salary”) itwould be obvious to a person skilled in the art how to code the requiredUDF to write the output shown in FIG. 2b into “out_XML_String”.

[0049] However, if it was required to write the prior art UDF withoutknowledge of the column names these could be passed as parameters intothe UDF. In this case the column names would need to be passed into theroutine and this would require a modification of the “select” statement,for example:

[0050] SELECT Employee_to XML(“Name”, Name, “Number”, Number, “Salary”,Salary)

[0051] FROM EmployeeTable;

[0052] And based on this “Select” statement the C function prototype ofthe UDF would be: void SQL_API_FN Employee_to_XML (char  *nameColString, char  *name, char  *numberColString, char *number,char  *salaryColString, int   *salary, char *out_XML_String );

[0053] Once again, from this prototype it would be obvious to a personskilled in the art how to code the required UDF to write the outputshown in FIG. 2b into “out_XML_String”.

[0054] Now consider the definition of a UDF which can be used with anRDBMS that exploits the present invention. According to the presentinvention the RDBMS must be able to pass to the UDF details of dataitems from a table in a manner that requires no prior knowledge of theirtypes. According to the preferred embodiment this is achieved by theRDBMS defining a C structure: struct SQLUDF_ROWDATA { int  *type, /*pointer to indicator of column data type */ void *value, /* pointer tovalue of column value     */ char *name /* pointer to string containingcolumn name */ };

[0055] This structure contains information pertaining to a single dataitem, and as a result, details of a number of data items can be passedto a UDF in an array of these structures. Such an array will contain oneelement (structure) for each data item passed to the routine from agiven row. Each element contains 3 members: “type” is an indicator ofthe data type, for example “1” could indicate Integer and “2” couldindicate a null terminated character string; “value” is the value of thedata and must be interpreted according to “type”; and “name” containsthe column name as defined when the table was created and is a nullterminated character string. The order of the data items in the arraywill be the same as specified in the “select” request or, if “*”(meaning all columns) was specified, the same as defined in the table.As a result a UDF can be defined by the C function prototype: voidSQL_API_FN to_XML ( int  numdata,    /* number of elements in array */SQLUDF_ROWDATA[] *rowdata,   /* array of data items */char  *output_XML_String /* output field for XML string */  );

[0056] The UDF is therefore passed an array of “SQLUDF_ROWDATA”structures which is addressed by variable “rowdata”. From this it wouldbe obvious to a person skilled in the art how to write the code to loadthe required output into the output variable “out_XML_String”.

[0057] Thus, according to the present invention it is possible to definean new type of UDF which takes, according to the preferred embodiment,an array of SQLUDF_ROWDATA structures as input. Note, for the remainderof this specification this new type of UDF will be referred to as a “rowfunction” in order to distinguish from the prior art UDF. Furtherbecause a row function, according to the preferred embodiment, producesthe same output (a single character string as specified in“output_XML_String”) as a prior art UDF it can be used in any selectstatement where a prior art UDF can be used. However, in otherembodiments the output of a row function could be different.

[0058] A row function is extremely versatile because, unlike a prior artUDF, it is not restricted to being passed a specific number of dataitems, and specific data types in a specific order. For example considerthe following “Select” statements: 1. SELECT to_XML(*) FROMEmployeeTable; 2. SELECT to_XML(Name,Number) FROM EmployeeTable WHERESalary>30000; 3. SELECT to_XML(Number,Salary) FROM EmployeeTable; 4.SELECT to_XML(Salary,Number) FROM EmployeeTable; 5. SELECTto_XML(Name,Salary,Number) FROM EmployeeTable;

[0059] For each of these statements, according to the prior art, adifferent UDF would be required because no two uses of the “to_XML” UDFhave parameters of the same type in the same order. However, accordingto the preferred embodiment of the present invention a single rowfunction could be written for use in each of these statements. Thus rowfunctions are particularly useful for, but not limited to, functionalitysuch as that used in the above example in which the row function is notrequired to carry out processing according to the actual value of a dataitem, but rather always processes a data item the same irrespective ofits value. Further, in this respect, a row function can be used fordifferent tables, for example “to_XML” as a row function would not carryout any processing particular to “EmployeeTable” and so could also beused for other tables.

[0060] Further, according to the present invention, the parameter listfor a row function specified in a select statement may includeparameters, for example a specific value, in conjunction with an “*”.This enables a row function to be written which is capable, for example,of either providing the output as shown in either FIG. 2b, which uses“XML” as the XML tag, or FIG. 2c, which uses “S” as the XML tag, andwhich is not limited to use with “EmployeeTable”. For example, if the“to_XML” row function assumes the XML tag is “XML” unless it receives aninput parameter, named “XMLTag”, containing a string to use as the XMLtag, it could be used in the following “select” statements:

[0061] 1. SELECT to XML(*) FROM EmployeeTable;

[0062] 2. SELECT to XML(“S” as $XMLTag, *) FROM Employee Table;

[0063] It is then obvious to a person skilled in the art how to write asingle “to_XML” row function such that the first invocation produces theoutput shown in FIG. 2b and the second invocation produces the outputshown in FIG. 2c.

[0064] Preferably a database engine which supports row functions alsoincludes execution time support for specification of parameters for arow function which includes use of a “*” with specific values. However,if this is not the case, or for performance reasons, a prepare phase maybe provided which expands the “*” into a full list of table columns foran execution phase. For example consider the following statement:

[0065] SELECT to XML(“S” as $XMLTag, *) FROM Employee Table;

[0066] The prepare phase would convert the statement, for the table“EmpoyeeTable” shown in FIG. 2a, to:

[0067] SELECT to_XML(“S” as $XMLTag, Name, Number, Salary)

[0068] FROM EmployeeTable;

[0069] Note that, according to the preferred embodiment of the presentinvention, specific values (or variables) are passed to a row functionin the same way as data items, that is as “name, value, type” triples.Such specific values are specified in the “select” statement as:

[0070] <value> as $<name>[is <type>]

[0071] For example: “5 as count is Integer” specifies an integervariable named count with a value of 5.

[0072] Note that “[is <type>]” is shown in [ ] as it is optional and, inthe event it is omitted, a default type is assumed. In the preferredembodiment the default type is VARCHAR (null terminated string). “Name,value, type” triples can then be specified to a row function, forexample, in a select statement:

[0073] SELECT to_XML(“S” as $XMLTag,*);

[0074] Thus the first element in the SQLUDF_ROWDATA structure willcontain a null terminated string (the default) named “$XMLTag” with avalue of “S”. Note that one or more such variables can be passed to arow function and at any position in the input list. These are thenpassed to the row function, with the data items, in the order specified.

[0075] A person skilled in the art would realize that, alternatively, avariable could be passed to the row function by adding it to thefunction prototype and passing it as an individual parameter rather thanin the SQLUDF_ROWDATA structure.

[0076] Optionally, according to the present invention, the preferredembodiment can further include support for row functions which require,for a given request, to know the first, last, and optionallyintermediate times it is called to process the given request. Forexample consider a, row function that is required to produce the outputshown in FIG. 2b, but preceded with a line comprising the string “<ALL>”and followed with a line comprising the string </ALL>. This is achievedin the preferred embodiment by defining the row function with a Cprototype of: void SQL_API_FN to_XMLGroup ( char    *calltype, /* typeof the call   */ int    numdata,   /* number of elements in array */SQLUDF_ROWDATA[] *rowdata,    /* array of data items */char  *output_XML_String /* output field for XML string */ );

[0077] This prototype further includes the “calltype” parameter which isset by the database engine to values which indicate the first,intermediate and last calls to the function which are part of processinga request. Now consider a request to call this function:

[0078] SELECT to_XMLGroup(“ALL” as $Group, *) FROM EmployeeTable;

[0079] The database engine will process this request by first calling“to_XMLGroup” with “calltype” set to indicate the first call to the rowfunction and an SQLUDF_ROWDATA structure containing the value of the“Group” variable and no data items. Following this “to_XMLGroup” will becalled, once for each row of suitable data, with “calltype” set toindicate an intermediate call and an SQLUDF_ROWDATA structure containingthe value of the “group” variable and data items from a single row ofthe table. Finally, after all rows have been processed, “to_XMLGroup” itis called a final time with “calltype” set to indicate the last call tothe row function and an SQLUDF_ROWDATA structure containing the value ofthe “Group” variable.

[0080] Given this calling sequence it would be obvious to a personskilled in the art how to write the “to_XMLGroup” function to producethe required output. Note that the function could produce an outputstring for each call, or could save output from some calls and returnconcatenated strings in later calls. For example “to_XMLGroup” couldreturn null strings to all calls apart from the last from which itreturns the entire output string. Further note that some requests toretrieve data can request that the data items are passed to the rowfunction in groups based on one or more data items. In this case the rowfunction can be called to indicate the first and last calls to the rowfunction for each group of data items and the row function may thereforebe called more than once with last and first call indications in theprocessing of a single request.

[0081] A person skilled in the art would realize that there are manyways of passing “calltype” to the row function. For example in anotherembodiment, it could be added to the SQLUDF_ROWDATA structure. Furtherif the request results in the row function being called for more thanone row of data, “calltype” can be set to indicate the “first” call whenpassing the row function the first row of data and to indicate “last”call when passing the last row of data. If the request results in thefunction being called for only one row of data, “calltype” can be set toindicate that the function will be called only once.

[0082] A person skilled in the art would realize that there are manydifferent ways of defining the SQLUDF_ROWDATA structure. The memberscould be in a different order or of a different type. For example anyfield could be defined as any type of pointer or any type of fixedlength field, although some choices offer better solutions than others.Further a length member can be included to indicate the length of thedata in the “value” member and one or more members can be included inone or more sub structures within the SQLUDF_ROWDATA structure.

[0083] A person skilled in the art would also realize that alternativemethods could be used for indicating the number of elements passed tothe “to_XML” function. In the above example the “numdata” variable wasspecified in the C function prototype. One alternative, for example,could be including a extra element at the end of the array which can berecognized in code as a null element. Another alternative, for example,could be adding another member to the “SQLUDF_ROWDATA” structure whichcan be used to mark the last element of the array.

[0084] Further, a person skilled in the art would realize that languagessuch as C provide support for variable-argument functions which arefunctions that accept a varying number of arguments. As a result, ratherthan passing a row function an array of “SQLUDF_ROWDATA” structures itwould be possible, for example, to pass a varying number of“SQLUDF_ROWDATA” structures or even a varying number of individualparameters which, in aggregate, contain the equivalent information as“SQLUDF_ROWDATA” structure for each data item requested.

[0085] Note that the inclusion of the “name” member in the“SQLUDF_ROWDATA” structure is optional because a row function can bewritten for a particular table and be hard coded with the names of thecolumns. However, the advantage of including the column name in the datapassed to the row function is that it is possible to write a generalpurpose row function which can be used with many different tables, forexample a row function that creates XML strings based on any table.

[0086] The preferred method, according to the present invention, of anRDBMS handling a row function will now be discussed with reference toFIGS. 3, 4 and 5. Note that this method does not include indicating to arow function the first and last calls to it.

[0087]FIG. 3 shows the processing of the RDBMS when processing a userrequest to register a row function with the RDBMS. At step 301 a requestis received to register a row function. At step 302 a check is made tosee if the register request is valid. According to the preferredembodiment a registration request may be invalid because the requestcontains errors or the function does not adhere to requirements, such asaccepting the correct inputs which are an SQLUDF_ROWDATA structure, aninteger to contain the number of data columns contained in thestructure, and a single output parameter. If the request is invalid anerror message is output at step 303, but if it is valid at step 304 thedetails of the row function are saved for look up when the function islater required in the processing of a select request.

[0088] Note that in DB2 a UDF is registered using a “CREATE FUNCTION”statement. In the preferred implementation a “ROW” option is added tothose available with the “CREATE FUNCTION” statement to designate a rowfunction. In another embodiment a new statement may be created for thispurpose, such as “CREATE ROWFUNCTION” In another embodiment the RDBMSmay look in the parameter list of the function defined to see if itcontains the “SQLUDF_ROWDATA” structure or the equivalent to thisstructure used in the particular embodiment. It may also be possible toomit the registration procedure altogether if the RDBMS recognises whena row function is used based on it being specified on a “select”statement and has some means of finding it when required such as adefined search path.

[0089]FIG. 4 shows the processing of the RDBMS when processing a requestto retrieve data, such as an SQL “Select” statement. At step 401 arequest to retrieve data from the database is received. At step 402 thefirst row of data that requires processing by the request is obtained.At step 403 the row obtained is processed. This step will be describedmore fully below with reference to FIG. 5. At step 404 a check is madeto see if the last row has been processed, if so the method is stopped,but if not the next row of suitable data is obtained at step 405 andprocessing loops back to step 403.

[0090]FIG. 5 shows the processing for each row of data which must beprocessed as a result of the request. In this process the RDBMS must besensitive to the inclusion of a row function in the request as it mustbuild the required data into the “SQLUDF_ROWDATA”, or equivalent,structure. At step 501 a check is made to see if the request includesuse of a row function. In the preferred embodiment this requires lookingup the specified row function in the details of registered row functionssaved at step 304 of FIG. 3. If a row function is not specified therequest is processed normally, according to the prior art, at step 502.However, if use of a row function is specified processing proceeds tostep 503. At this step the RDBMS builds the data required for processingby the row function into a buffer. In the preferred embodiment thebuffer is defined by an array of “SQLUDF_ROWDATA” structures whichcontain the column name, value, and type for each data item requestedand details of any variables specified, in the request, for passing tothe function. Other embodiments may use a different format of buffer andnot include the column name.

[0091] At this step in the preferred embodiment, the RDBMS also tracksthe number of data items contained in the buffer as this is also passedto the row function. At step 504 the row function is called passing thecontents of the buffer and the number of data items that it contains.Any output produced by the row function is then processed by the RDBMSaccording to the data retrieval request received at step 401 in FIG. 4.As the request may include use of more than one row function, at step505 a check is made to see if this is the case. If another row functionis specified processing returns to step 503. Steps 503, 504, and 505 arethen repeated until all row functions specified in the request have beenprocessed. When all row functions have been processed, processingcontinues to step 502 where continued processing of the requestaccording to the prior art are carried out.

[0092] Note that for convenience, FIG. 5 shows the row function(s)specified in a request being processed first followed by any normalprocessing. In practice, processing of a data retrieval request will notfollow such a rigid pattern and will depend largely on the syntax of theretrieval request. As a result, the order of processing row functionsfollowed by normal processing is not essential to the present invention.

[0093] Further note that the exact sequence of the steps from themethods described in FIGS. 3, and 4 may vary in other embodimentswithout departing from the present invention. For example in FIG. 4, thedatabase calls the row function for individual rows of data. However, itwould be quite possible to use the present invention in an RDBMS thatprovides data from multiple rows to a row function in a singleinvocation.

[0094] Thus the present invention provides a new method for enabling theflexible retrieval of data from a relational database. This is based onfunctions which are generally defined by users and specified in dataretrieval requests to carry out some processing of the data retrieved.According to the present invention these functions are passedinformation containing details of the value, type, and optionally name,for each of the data items it is to process. Thus an RDBMS enables theprovision of functions which can be written independently of the datawhich they are called to process.

What is claimed is:
 1. A data processing method for a database engine toprocess a request, the method comprising the steps: (a) receiving therequest, wherein the request comprises a function that processes atleast one data item specified in the request; (b) retrieving the atleast one data item from a database; and (c) invoking the function,wherein the invoking step comprises passing to the function at least oneparameter, wherein the at least one parameter comprises informationrelated to the at least one data item.
 2. The method of claim 1, whereinthe at least one data item has an associated value and type, and whereinthe information in the at least one parameter comprises the value andtype of the at least one retrieved data item.
 3. The method of claim 1,wherein the request further includes a variable, the variable having anassociated value and name, and wherein the at least one parameterfurther comprises the name and value of the variable.
 4. The method ofclaim 3, wherein the variable further has an associated type and whereinthe at least one parameter further comprises an indication of the typeof the variable.
 5. The method of claim 1, wherein the databasecomprises a table, the table comprises a plurality of rows, and the rowscomprise a plurality of data items, and the one or more retrieved dataitems are from a single row.
 6. The method according of claim 1, whereinthe invoking step (c) is performed one or more times, the method furthercomprising the steps: (d) before the invoking step (c) is performed fora first time, initially invoking the function, wherein the initialinvoking step comprises passing to the function a first parameter, thefirst parameter comprising an indication that the initial invocation isthe first invocation of the function; and (e) after the invoking step(c) is performed for a last time, finally invoking the function, whereinthe final invoking step comprises passing to the function a secondparameter, the second parameter comprising an indication that the finalinvocation is the last invocation of the function.
 7. The method ofclaim 1, wherein the invoking step (c) is performed a plurality oftimes, and wherein the first time the invoking step is performed the atleast one parameter further comprises an indication that the invocationis the first invocation of the function; and the last time the invokingstep is performed the at least one parameter further comprises anindication the invocation is the last invocation of the function.
 8. Themethod of claim 1, wherein the request includes a first value whichindicates that the at least one data item comprises all data items froma table.
 9. The method of claim 1, wherein the step of invoking thefunction (c) is responsive to the function having been previouslyregistered with the database engine.
 10. The method of claim 1, furthercomprising the step, following the invoking step, of: (d) obtaining astring of characters as a return value from the function.
 11. A dataprocessing apparatus for processing a request to access a databasecomprising: means for receiving the request, wherein the requestcomprises a function that processes at least one data item specified bythe request; means for retrieving the at least one data item from adatabase; and invocation means for invoking the function, the invocationmeans comprising means for passing to the function at least oneparameter, wherein the at least one parameter comprises informationrelated to the at least one data item.
 12. The apparatus according toclaim 11, wherein the at least one data item has an associated value andtype, and wherein the information in the at least one parametercomprises the value and type of the at least one retrieved data item.13. The apparatus according to claim 11, wherein the request furtherincludes a variable, the variable having an associated value and name,and wherein the at least one parameter further comprises the name andvalue of the variable.
 14. The apparatus according to claim 13, whereinthe variable further has an associated type and wherein the at least oneparameter further comprises an indication of the type of the variable.15. The apparatus according to claim 11, wherein the database comprisesa table, the table comprises a plurality of rows, and the rows comprisea plurality of data items; and the one or more retrieved data items arefrom a single row.
 16. The apparatus according to claim 11, wherein theinvocation means further comprising: means for determining that thefunction has not previously been called; means, responsive todetermining that the function has not previously been invoked, forpassing to the function a first parameter, wherein the first parametercomprises an indication that the invocation is the first invocation ofthe function; means for determining that the function has beenpreviously invoked and has passed the at least one data item; means,responsive to determining that the function has been invoked and haspassed the at least one data item, for passing to the function a secondparameter, wherein the second parameter comprises an indication that theinvocation is the last invocation of the function.
 17. The apparatusaccording to claim 11, wherein the request includes a first value whichindicates that the at least one data item comprises all data items froma table.
 18. The apparatus according to claim 11, further comprising:means for registering functions; and wherein the means for invoking isresponsive to the function having been previously registered with thedatabase engine.
 19. The apparatus according to claim 11 furthercomprising: means for obtaining a string of characters as a return valuefrom the function.
 20. A computer program product comprisinginstructions which, when executed on a data processing host, cause thehost to carry out a method comprising the steps: (a) receiving therequest, wherein the request comprises a function that processes atleast one data item specified in the request; (b) retrieving the atleast one data item from a database; and (c) invoking the function,wherein the invoking step comprises passing to the function at least oneparameter, wherein the at least one parameter comprises informationrelated to the at least one data item.
 21. The computer program productaccording to claim 20, wherein the at least one data item has anassociated value and type, and wherein the information in the at leastone parameter comprises the value and type of the at least one retrieveddata item.
 22. The computer program product according to claim 20,wherein the request further includes a variable, the variable having anassociated value and name, and wherein the at least one parameterfurther comprises the name and value of the variable.
 23. The computerprogram product according to claim 22, wherein the variable further hasan associated type and wherein the at least one parameter furthercomprises an indication of the type of the variable.
 24. The computerprogram product according to claim 20, wherein the database comprises atable, the table comprises a plurality of rows, and the rows comprise aplurality of data items, and the one or more retrieved data items arefrom a single row.
 25. The computer program product according to claim20, wherein the invoking step (c) is performed one or more times, themethod further comprising the steps: (d) before the invoking step (c) isperformed for a first time, initially invoking the function, wherein theinitial invoking step comprises passing to the function a firstparameter, the first parameter comprising an indication that the initialinvocation is the first invocation of the function; and (e) after theinvoking step (c) is performed for a last time, finally invoking thefunction, wherein the final invoking step comprises passing to thefunction a second parameter, the second parameter comprising anindication that the final invocation is the last invocation of thefunction.
 26. The computer program product according to claim 20,wherein the invoking step (c) is performed a plurality of times, andwherein the first time the invoking step is performed the at least oneparameter further comprises an indication that the invocation is thefirst invocation of the function; and the last time the invoking step isperformed the at least one parameter further comprises an indication theinvocation is the last invocation of the function.
 27. The computerprogram product according to claim 20, wherein the request includes afirst value which indicates that the at least one data item comprisesall data items from a table.
 28. The computer program product accordingto claim 20, wherein the step of invoking the function (c) is responsiveto the function having been previously registered with the databaseengine.